Polyhedral parallel code generation for CUDA

Related resources

AutoGPU : Automatic Generation of CUDA Kernel Code

Manual optimization of a CUDA kernel can be an arduous task, even for the simplest of kernels. The CUDA programming model is such that high performance can be achieved only if memory accesses in the kernel follow certain patterns; further, fine-tuning of the kernel execution and loop configuration may result in a dramatic increase in performance. The number of possible such configurations mak...

Polyhedral Code Generation in the Real World

The polyhedral model is known to be a powerful framework for reasoning about high-level loop transformations. Recent developments in optimizing compilers broke some generally accepted ideas about the limitations of this model. First, thanks to advances in dependence analysis for irregular access patterns, its applicability, which was supposed to be limited to very simple loop nests, has been extended...

On Code-Generation in the Polyhedral Model

Automatic parallelization in the polyhedral model is based on affine transformations from an original computation domain (iteration space) to a target space-time domain, often with a different transformation for each variable. Code generation is an often ignored step in this process that has a significant impact on the quality of the final code. Previous code generation methods are based on loop spli...

Automatic C-to-CUDA Code Generation for Affine Programs

Graphics Processing Units (GPUs) offer tremendous computational power. CUDA (Compute Unified Device Architecture) provides a multi-threaded parallel programming model, facilitating high performance implementations of general-purpose computations. However, the explicitly managed memory hierarchy and multi-level parallel view make manual development of high-performance CUDA code rather complicate...

Interprocedural Transformations for Parallel Code Generation

We present a new approach that enables compiler optimization of procedure calls and loop nests containing procedure calls. We introduce two inter-procedural transformations that move loops across procedure boundaries, exposing them to traditional optimizations on loop nests. These transformations are incorporated into a code generation algorithm for a shared-memory multiprocessor. The code gene...

Journal

Journal title: ACM Transactions on Architecture and Code Optimization

Year: 2013

ISSN: 1544-3566,1544-3973

DOI: 10.1145/2400682.2400713